217 research outputs found

Environment capturing with Microsoft Kinect

Authors: Komura Taku; Mackay Kevin; Shum Hubert P. H.
Publication date: 01/01/2012

Northumbria Research Link

Fast accelerometer-based motion recognition with a dual buffer framework

Authors: Komura Taku; Shum Hubert P. H.; Takagi Shu
Publication venue: IPI Press
Publication date: 28/06/2011

Northumbria Research Link

Crossref

Edinburgh Research Explorer

Illumination-Based Data Augmentation for Robust Background Subtraction

Authors: Ho Edmond; Sakkos Dimitrios; Shum Hubert
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/08/2019

A core challenge in background subtraction (BGS) is handling videos with sudden illumination changes in consecutive frames. In this paper, we tackle the problem from a data point of view using data augmentation. Our method performs data augmentation that not only creates endless data on the fly, but also features semantic transformations of illumination which enhance the generalisation of the model. It successfully simulates flashes and shadows by applying the Euclidean distance transform over a binary mask generated randomly. Such data allows us to effectively train an illumination-invariant deep learning model for BGS. Experimental results demonstrate that the synthetic data contributes to the models' ability to perform BGS even when significant illumination changes take place.
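The augmentation described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the `strength` and `n_seeds` parameters, the linear falloff, and the assumption that images are floats in [0, 1] are all choices made here for illustration. The distance transform is computed by brute force with NumPy so the example stays self-contained; for real image sizes, `scipy.ndimage.distance_transform_edt(~mask)` would be the usual choice.

```python
import numpy as np

def distance_transform(mask):
    """Brute-force Euclidean distance from each pixel to the nearest True pixel."""
    seeds = np.argwhere(mask)                      # (k, 2) seed coordinates
    if len(seeds) == 0:
        raise ValueError("mask must contain at least one True pixel")
    rows, cols = np.indices(mask.shape)
    grid = np.stack([rows, cols], axis=-1)         # (h, w, 2) pixel coordinates
    diff = grid[:, :, None, :] - seeds[None, None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1)).min(axis=-1)

def augment_illumination(image, rng, strength=0.5, n_seeds=3):
    """Apply a random local flash (brighten) or shadow (darken) to an image.

    `image` is assumed to be a float array in [0, 1], shaped (h, w) or (h, w, c).
    """
    h, w = image.shape[:2]
    # Random binary mask: a handful of randomly placed seed pixels.
    mask = np.zeros((h, w), dtype=bool)
    mask[rng.integers(0, h, n_seeds), rng.integers(0, w, n_seeds)] = True
    dist = distance_transform(mask)
    # Smooth falloff: 1 at the seeds, 0 at the farthest pixel.
    falloff = 1.0 - dist / max(dist.max(), 1e-8)
    sign = rng.choice([-1.0, 1.0])                 # -1 gives a shadow, +1 a flash
    gain = 1.0 + sign * strength * falloff
    if image.ndim == 3:
        gain = gain[..., None]                     # broadcast over colour channels
    return np.clip(image * gain, 0.0, 1.0)
```

Because the mask is drawn afresh on every call, each training image can be re-augmented at load time, which is one way to read "endless data on the fly".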

arXiv.org e-Print Archive

Northumbria Research Link

Crossref

Abnormal Infant Movements Classification With Deep Learning on Pose-Based Features

Authors: Embleton Nicholas; Fehringer Gerhard; Ho Edmond; Marcroft Claire; McCay Kevin; Shum Hubert
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 12/03/2020

The pursuit of early diagnosis of cerebral palsy has been an active research area with some very promising results using tools such as the General Movements Assessment (GMA). In our previous work, we explored the feasibility of extracting pose-based features from video sequences to automatically classify infant body movement into two categories, normal and abnormal. The classification was based upon the GMA, which was carried out on the video data by an independent expert reviewer. In this paper, we extend our previous work by extracting the normalised pose-based feature sets, Histograms of Joint Orientation 2D (HOJO2D) and Histograms of Joint Displacement 2D (HOJD2D), for use in new deep learning architectures. We explore the viability of using these pose-based feature sets for automated classification within a deep learning framework by carrying out extensive experiments on five new deep learning architectures. Experimental results show that the proposed fully connected neural network FCNet performed robustly across different feature sets. Furthermore, the proposed convolutional neural network architectures demonstrated excellent performance in handling features in higher dimensionality. We make the code, extracted features and associated GMA labels publicly available.
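As a rough illustration of what a histogram of 2D joint orientations might look like, the sketch below bins the angle of each limb segment over a sequence of 2D poses and normalises each histogram to sum to one. This is a simplified reading of the feature name, not the published HOJO2D definition; the function name, the `bones` list of (parent, child) joint indices, and the choice of 8 bins are assumptions made for the example.

```python
import numpy as np

def joint_orientation_histogram(poses, bones, n_bins=8):
    """Normalised histogram of 2D limb orientations over a pose sequence.

    poses: array of shape (T, J, 2), i.e. T frames of J joints in image coordinates.
    bones: list of (parent, child) joint-index pairs (hypothetical skeleton layout).
    Returns an array of shape (len(bones), n_bins); each row sums to 1.
    """
    poses = np.asarray(poses, dtype=float)
    hists = []
    for parent, child in bones:
        vec = poses[:, child] - poses[:, parent]        # (T, 2) limb vectors
        angles = np.arctan2(vec[:, 1], vec[:, 0])       # orientation in [-pi, pi]
        hist, _ = np.histogram(angles, bins=n_bins, range=(-np.pi, np.pi))
        hists.append(hist / max(hist.sum(), 1))         # normalise by frame count
    return np.array(hists)
```

Normalising by the number of frames makes sequences of different lengths comparable, which is one plausible motivation for using normalised feature sets as classifier input.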

Northumbria Research Link

E-space: Manchester Metropolitan University's Research Repository

Enlighten

DSPP: Deep Shape and Pose Priors of Humans

Authors: Hu Shanfeng; Mucherino Antonio; Shum Hubert
Publication venue: Association for Computing Machinery (ACM)
Publication date: 01/10/2019

The prior knowledge of real human body shapes and poses is fundamental in computer games and animation (e.g. performance capture). Linear subspaces such as the popular SMPL model have a limited capacity to represent the large geometric variations of human shapes and poses. What is worse is that random sampling from them often produces non-realistic humans because the distribution of real humans is more likely to concentrate on a non-linear manifold instead of the full subspace. To address this problem, we propose to learn human shape and pose manifolds using a more powerful deep generator network, which is trained to produce samples that cannot be distinguished from real humans by a deep discriminator network. In contrast to previous work that learns both the generator and discriminator in the original geometry spaces, we learn them in the more representative latent spaces discovered by a shape and a pose auto-encoder network respectively. Random sampling from our priors produces higher-quality human shapes and poses. The capacity of our priors is best applied to applications such as virtual human synthesis in games.

Northumbria Research Link

Crossref

Interaction-based Human Activity Comparison

Authors: Ho Edmond; Shen Yi; Shum Hubert; Yang Longzhi
Publication venue: IEEE
Publication date: 25/01/2019

Traditional methods for motion comparison consider features from individual characters. However, the semantic meaning of many human activities is usually defined by the interaction between characters, such as a high-five between two characters. There has been little success in adapting interaction-based features for activity comparison, as they either do not have a fixed topology or are high-dimensional. In this paper, we propose a unified framework for activity comparison from the interaction point of view. Our new metric evaluates the similarity of interactions by adapting the Earth Mover’s Distance onto a customized geometric mesh structure that represents spatial-temporal interactions. This allows us to compare different classes of interactions and discover their intrinsic semantic similarity. We created five interaction databases of different natures, covering both two-character (synthetic and real-people) and character-object interactions, which are open for public use. We demonstrate how the proposed metric aligns well with the semantic meaning of the interaction. We also apply the metric in interaction retrieval and show how it outperforms existing ones. The proposed method can be used for unsupervised activity detection in monitoring systems and activity retrieval in smart animation systems.

Durham Research Online

Northumbria Research Link

Enlighten

Prior-less 3D Human Shape Reconstruction with an Earth Mover’s Distance Informed CNN

Authors: Ho Edmond; McCay Kevin; Shum Hubert; Zhang Jingtian
Publication venue: Association for Computing Machinery (ACM)
Publication date: 28/10/2019

We propose a novel end-to-end deep learning framework, capable of 3D human shape reconstruction from a 2D image without the need for a 3D prior parametric model. We employ a “prior-less” representation of the human shape using unordered point clouds. Due to the lack of prior information, comparing the generated and ground truth point clouds to evaluate the reconstruction error is challenging. We solve this problem by proposing an Earth Mover’s Distance (EMD) function to find the optimal mapping between point clouds. Our experimental results show that we are able to obtain a visually accurate estimation of the 3D human shape from a single 2D image, with some inaccuracy for heavily occluded parts.
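To make the idea of an "optimal mapping between point clouds" concrete, here is a minimal sketch of the Earth Mover's Distance for two unordered clouds. It assumes both clouds have the same number of points with equal weight, in which case the EMD reduces to a minimum-cost one-to-one assignment. It is an evaluation-style computation only, not the training loss used in the paper, and the function name is hypothetical.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def earth_movers_distance(cloud_a, cloud_b):
    """EMD between two equal-size, uniformly weighted point clouds.

    Returns the mean matched distance and, for each point in cloud_a,
    the index of its matched point in cloud_b.
    """
    a = np.asarray(cloud_a, dtype=float)
    b = np.asarray(cloud_b, dtype=float)
    if a.shape != b.shape:
        raise ValueError("this sketch assumes clouds of identical shape (n, d)")
    # Pairwise Euclidean distances between every point in a and every point in b.
    cost = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)
    # Minimum-cost one-to-one matching (Hungarian-style solver).
    rows, cols = linear_sum_assignment(cost)
    return cost[rows, cols].mean(), cols
```

Since the matching is recomputed from the distances, the result does not depend on the order in which points are stored, which is the property needed when comparing unordered point clouds.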

Northumbria Research Link

Crossref

E-space: Manchester Metropolitan University's Research Repository

Enlighten

3D Car Shape Reconstruction from a Single Sketch Image

Authors: Ho Edmond; Morishima Shigeo; Nozawa Naoki; Shum Hubert
Publication venue: Association for Computing Machinery (ACM)
Publication date: 01/10/2019

Efficient car shape design is a challenging problem in both the automotive industry and the computer animation/games industry. In this paper, we present a system to reconstruct the 3D car shape from a single 2D sketch image. To learn the correlation between 2D sketches and 3D cars, we propose a Variational Autoencoder deep neural network that takes a 2D sketch and generates a set of multiview depth and mask images, which are a more effective representation than a 3D mesh and can be combined to form the 3D car shape. To ensure the volume and diversity of the training data, we propose a feature-preserving car mesh augmentation pipeline for data augmentation. Since deep learning has limited capacity to reconstruct fine-detail features, we propose a lazy learning approach that constructs a small subspace based on a few relevant car samples in the database. Due to the small size of such a subspace, fine details can be represented effectively with a small number of parameters. With a low-cost optimization process, a high-quality car with detailed features is created. Experimental results show that the system consistently creates highly realistic cars of substantially different shape and topology, with a very low computational cost.

Northumbria Research Link

Crossref

Enlighten

Saliency-Informed Spatio-Temporal Vector of Locally Aggregated Descriptors and Fisher Vector for Visual Action Recognition

Authors: Organisciak Daniel; Shum Hubert; Yang Longzhi; Zuo Zheming
Publication date: 03/09/2018

Feature encoding has been extensively studied for the task of visual action recognition (VAR). The recently proposed super vector-based encoding methods, such as the Vector of Locally Aggregated Descriptors (VLAD) and the Fisher Vector (FV), have significantly improved the recognition performance. Despite this success, they still struggle with the superfluous information that is present during the training stage, which makes the methods computationally expensive when applied to a large number of extracted features. To address this challenge, this paper proposes a Saliency-Informed Spatio-Temporal VLAD (SST-VLAD) approach which selects the extracted features corresponding to a small number of videos in the data set by considering both the spatial and temporal video-wise saliency scores; the same extension principle has also been applied to the FV approach. The experimental results indicate that the proposed feature encoding scheme consistently outperforms the existing ones with significantly lower computational cost.
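For readers unfamiliar with VLAD, the sketch below shows the standard encoding step that the abstract builds on: each local descriptor is assigned to its nearest codebook centroid, the residuals are summed per centroid, and the result is power- and L2-normalised. The saliency-based feature selection proposed in the paper is not modelled here; the function name and the assumption of a precomputed codebook (for example from k-means) are illustrative.

```python
import numpy as np

def vlad_encode(descriptors, centroids):
    """Standard VLAD encoding of local descriptors against a fixed codebook.

    descriptors: (n, d) local features from one video or image.
    centroids:   (k, d) codebook, assumed to be learned beforehand.
    Returns a unit-length vector of size k * d.
    """
    x = np.asarray(descriptors, dtype=float)
    c = np.asarray(centroids, dtype=float)
    # Hard-assign each descriptor to its nearest centroid.
    sq_dist = ((x[:, None, :] - c[None, :, :]) ** 2).sum(axis=-1)
    nearest = sq_dist.argmin(axis=1)
    # Accumulate residuals (descriptor minus centroid) per centroid.
    vlad = np.zeros_like(c)
    for k in range(len(c)):
        members = x[nearest == k]
        if len(members):
            vlad[k] = (members - c[k]).sum(axis=0)
    vlad = vlad.ravel()
    # Signed square-root (power) normalisation, then L2 normalisation.
    vlad = np.sign(vlad) * np.sqrt(np.abs(vlad))
    norm = np.linalg.norm(vlad)
    return vlad / norm if norm > 0 else vlad
```

The cost of the assignment step grows with the number of descriptors, which is why reducing the set of features fed into the encoder, as the abstract proposes, can lower the computational cost.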

Northumbria Research Link